Can machine translation systems be evaluated by the crowd alone

Y Graham; T Baldwin; A Moffat; J Zobel

Journal article

Natural Language Engineering | Published: 2017

DOI: 10.1017/S1351324915000339

Abstract

Crowd-sourced assessments of machine translation quality allow evaluations to be carried out cheaply and on a large scale. It is essential, however, that the crowd's work be filtered to avoid contamination of results through the inclusion of false assessments. One method is to filter via agreement with experts, but even amongst experts agreement levels may not be high. In this paper, we present a new methodology for crowd-sourcing human assessments of translation quality, which allows individual workers to develop their own individual assessment strategy. Agreement with experts is no longer required, and a worker is deemed reliable if they are consistent relative to their own previous work. ..

University of Melbourne Researchers

Tim Baldwin Author

Alistair Moffat Author

Justin Zobel Author

Related Projects (1)

Principles, Practice, and Pragmatics of Measurement in Experimental Computer Science

The project team's confidence in scientific knowledge is partly due to the robustness of the systems of measurement used in experiments. The..

Grants

Awarded by Australian Research Council

Funding Acknowledgements

This work was supported by the Australian Research Council's Discovery Projects Scheme (grant DP110101934) and Science Foundation Ireland through the CNGL Programme (grant 12/CE/I2267) in the ADAPT Centre (www.adaptcentre.ie) at Trinity College Dublin.